@hokein hokein commented Jul 8, 2025

No description provided.

hokein and others added 15 commits July 8, 2025 13:29
…46412)

Converting a source location to and from its raw encoding is
unnecessary.
This patch stops storing a source range in `CXXOperatorCallExpr` and
keeps only the begin location.

This change allows us to retain the optimization llvm#141058 when switching
to 64-bit source locations.

Performance results:

https://llvm-compile-time-tracker.com/compare.php?from=0588e8188c647460b641b09467fe6b13a8d510d5&to=5958f83476a8b8ba97936f262396d3ff98fb1662&stat=instructions:u
fix

Reduce the Stmt size back to 8 bytes.

Reduce the CallExpr size

Fix the ObjCContainerDecl bit field

Change the SourceLocation::UIntTy to uint64_t

Update other SourceManager's getDecomposedSpellingLoc APIs, and fix many
failing tests.

Remaining failures:

  Clang :: Index/IBOutletCollection.m
  Clang :: Index/annotate-macro-args.m
  Clang :: Index/annotate-module.m
  Clang :: Index/annotate-tokens-pp.c
  Clang :: Index/annotate-tokens.c
  Clang :: Index/annotate-toplevel-in-objccontainer.m
  Clang :: Index/hidden-redecls.m
  Clang :: Index/index-module-with-vfs.m
  Clang :: Index/index-module.m
  Clang :: Index/index-pch-objc.m
  Clang :: Index/index-pch-with-module.m
  Clang :: Index/index-pch.cpp
  Clang :: Index/targeted-annotation.c
  Clang :: Lexer/SourceLocationsOverflow.c
  Clang-Unit :: ./AllClangUnitTests/PPMemoryAllocationsTest/PPMacroDefinesAllocations
  Clang-Unit :: ./AllClangUnitTests/SourceLocationEncoding/Individual
  Clang-Unit :: ./AllClangUnitTests/SourceLocationEncoding/Sequence
  Clang-Unit :: libclang/./libclangTests/14/53
  Clang-Unit :: libclang/./libclangTests/45/53
  Clang-Unit :: libclang/./libclangTests/47/53
  Clang-Unit :: libclang/./libclangTests/48/53
  Clang-Unit :: libclang/./libclangTests/49/53
  Clang-Unit :: libclang/./libclangTests/50/53
  Clang-Unit :: libclang/./libclangTests/52/53

Fix libclang failures

Fix Rewrite APIs

Fix PPMemoryAllocationsTest

Fix SourceLocationEncodingTest

More unsigned -> SourceLocation::UIntTy changes in the SourceManager APIs

Update the type of std::pair<FileID, unsigned> in CIndex.cpp

Fix SourceLocationEncodingTest

Tweak the SourceLocation Implementation.

The source location has a `Bits` parameter that specifies the number of
bits used for the offset (40 by default).

Make MathExtra templates constexpr

Test Bits=64 perf

Try 48 bits

No bitfields

Fix CallExpr optimization.

Test Bits=64 perf

Switch Bits back to 40.

Reduce SubstNonTypeTemplateParmExpr size: 48 -> 40 bytes

Reduce OpaqueValueExpr: 32 -> 24 bytes

Reduce CXXDependentScopeMemberExpr size: 88 -> 80 bytes

Reduce DeclRefExpr size: 48 -> 40 bytes.

by moving out the two source locations for CXXOpName from DeclarationNameLoc

Fix some merge conflicts.

Move the Loc back to the StmtBitFields if possible to save AST size.

Improve getFileIDLocal binary search.

Optimize binary search by using a dedicated offset table

improve the cache performance

Revert the static_assert change for ObjCContainerDeclBitfields.

Fix the compile failures for include-cleaner.

Fix clang-tidy build.

Fix clangd unittest

Fix windows build failures.

unsigned long is 32 bits on MSVC

More Windows fixes

Change the underlying StmtBitField type to uint64_t, fix windows
failures.

This keeps sizeof(Stmt) at 8 bytes.

More Windows fixes

Fix merge failures

Update comments for SourceLocation.

clang-format

revert the Rewrite change.

Don't change the FileIDAndOffset type.

Revert the change in ObjCContainerDeclBitfields

Revert the change in HTMLReport.cpp

Revert the unsigned -> UIntTy change in Diagnostic.h

Revert the unsigned->UIntTy change in SourceManager.

revert the binary optimization change.

clang-format

More cleanup

Cleanup some unnecessary change.

Get rid of the Range in CXXOperatorCallExpr.

revert unintentional changes.

Remove unintentional change.

Revert unintentional changes.
All deserialized VarDecl initializers are EvaluatedStmt, but not all
EvaluatedStmt initializers are from a PCH. Calling
`VarDecl::hasInitWithSideEffects` can trigger constant evaluation, but
it's hard to know ahead of time whether that will trigger
deserialization - even if the initializer is fully deserialized, it may
contain a call to a constructor whose body is not deserialized. By
caching the result of `VarDecl::hasInitWithSideEffects` and populating
that cache during deserialization we can guarantee that calling it won't
trigger deserialization regardless of the state of the initializer.
This also reduces memory usage by removing the `InitSideEffectVars` set
in `ASTReader`.

rdar://154717930
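
The caching idea described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not Clang's actual code: `VarInfo`, `setCachedHasSideEffects`, and the `Compute` callback are hypothetical names standing in for `VarDecl`, the deserializer, and the potentially expensive evaluation.

```cpp
#include <functional>
#include <utility>

// Hypothetical stand-in for a declaration whose initializer may have
// side effects. The real VarDecl/ASTReader code is more involved.
class VarInfo {
public:
  explicit VarInfo(std::function<bool()> Fn) : Compute(std::move(Fn)) {}

  // A (hypothetical) deserializer that already knows the answer stores
  // it here, so later queries never need to run Compute.
  void setCachedHasSideEffects(bool V) {
    State = V ? CacheState::Yes : CacheState::No;
  }

  // Computes the answer at most once, then serves it from the cache.
  bool hasInitWithSideEffects() {
    if (State == CacheState::Unknown)
      State = Compute() ? CacheState::Yes : CacheState::No;
    return State == CacheState::Yes;
  }

private:
  enum class CacheState { Unknown, No, Yes };

  std::function<bool()> Compute;          // potentially expensive query
  CacheState State = CacheState::Unknown; // tri-state cache
};
```

With the cache populated up front, a query on a deserialized variable returns the stored value and never invokes the expensive path.
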
EvaluateAsInitializer does not support evaluating values with dependent
types. This was previously guarded with a check for the initializer
expression, but it is possible for the VarDecl to have a dependent type
without the initializer having a dependent type, when the initializer is
a specialized template type and the VarDecl has the unspecialized type.
This adds a guard checking for dependence in the VarDecl type as well.
This fixes the issue raised by Google in
llvm#145447
… getFileID (llvm#146782)

`getFileID` is a hot method. By caching the offset range in
`LastFileIDLookup`, we can more quickly check whether a given offset
falls within it, avoiding calling `isOffsetInFileID`.

https://llvm-compile-time-tracker.com/compare.php?from=0588e8188c647460b641b09467fe6b13a8d510d5&to=64843a500f0191b79a8109da9acd7e80d961c7a3&stat=instructions:u
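
As a rough illustration of the idea, the sketch below caches the offset range of the last lookup so that a repeated query costs two comparisons. It is a simplified model, not the `SourceManager` implementation: `FileTable`, `StartOffsets`, and `FastPathHits` are hypothetical names, and files are modeled as contiguous offset ranges.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical model: file I covers [StartOffsets[I], StartOffsets[I+1]),
// and the last file ends at EndOffset. Starts must be sorted ascending.
class FileTable {
public:
  FileTable(std::vector<uint64_t> Starts, uint64_t End)
      : StartOffsets(std::move(Starts)), EndOffset(End) {
    assert(!StartOffsets.empty() && "need at least one file");
  }

  // Returns the index of the file containing Offset.
  int getFileID(uint64_t Offset) {
    assert(Offset >= StartOffsets.front() && Offset < EndOffset);
    // Fast path: check the range cached by the previous lookup.
    if (Offset >= LastBegin && Offset < LastEnd) {
      ++FastPathHits;
      return LastID;
    }
    // Slow path: binary search for the first start offset > Offset.
    auto It =
        std::upper_bound(StartOffsets.begin(), StartOffsets.end(), Offset);
    int ID = static_cast<int>(It - StartOffsets.begin()) - 1;
    // Remember the range of the file we just found.
    LastID = ID;
    LastBegin = StartOffsets[ID];
    LastEnd = (It == StartOffsets.end()) ? EndOffset : *It;
    return ID;
  }

  unsigned FastPathHits = 0; // exposed only to make the effect observable

private:
  std::vector<uint64_t> StartOffsets;
  uint64_t EndOffset;
  int LastID = -1;
  uint64_t LastBegin = 0, LastEnd = 0; // empty range: first lookup misses
};
```

Consecutive lookups that land in the same file (the common case while lexing one file) take the fast path and skip the search entirely.
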
…6604)

The `SLocEntry` structure is 24 bytes, and the binary search only needs
the offset. Loading an entry's offset might pull the entire SLocEntry
object into the CPU cache.

To make the binary search much more cache-efficient, we use a separate
offset table.

See
https://llvm-compile-time-tracker.com/compare.php?from=650d0151c623c123e4e9736fe50421624a329260&to=6af564c0d75aff28a2784a8554448c0679877792&stat=instructions:u.
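
The layout change can be sketched as below. This is an illustrative model only: `Entry` is a hypothetical 24-byte record standing in for `SLocEntry`, and `EntryTable` is not a real Clang class. The point is that the search touches only a dense array of 8-byte offsets, so more keys fit in each cache line than when stepping through the full entries.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical 24-byte record; only Offset is needed by the search.
struct Entry {
  uint64_t Offset;
  uint64_t PayloadA;
  uint64_t PayloadB;
};
static_assert(sizeof(Entry) == 24, "sketch assumes a 24-byte entry");

class EntryTable {
public:
  // Entries must be added in strictly increasing Offset order.
  void add(const Entry &E) {
    assert(Offsets.empty() || E.Offset > Offsets.back());
    Entries.push_back(E);
    Offsets.push_back(E.Offset); // parallel table, kept in sync
  }

  // Index of the last entry whose Offset is <= Query.
  // The search reads only the compact Offsets array.
  std::size_t find(uint64_t Query) const {
    assert(!Offsets.empty() && Query >= Offsets.front());
    auto It = std::upper_bound(Offsets.begin(), Offsets.end(), Query);
    return static_cast<std::size_t>(It - Offsets.begin()) - 1;
  }

  const Entry &get(std::size_t I) const { return Entries[I]; }

private:
  std::vector<Entry> Entries;    // full records, read after the search
  std::vector<uint64_t> Offsets; // search keys only
};
```

The trade-off is 8 extra bytes per entry and the need to keep both tables in sync; the full record is loaded only once, after the index is known.
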
…vm#148726)

These objects are used as local stack variables during parsing, and they
are not small. This patch reduces their sizes:

* `ParsedAttributesView`: 72 → 40 bytes
* `AttributePool`: 72 → 40 bytes

No negative performance impact has been
[observed](https://llvm-compile-time-tracker.com/compare.php?from=a709621cd545b061782b03136286227867b452a6&to=f50500b3c178e97c0c861301e853e6d5b859040b&stat=instructions:u).

**Context:**
We have some verilator-generated code with extremely deep nesting of
parenthesized expressions, e.g.:

```cpp
bool s = 
(...(bool)(i[0])
 |(bool)(i[1]))
 |(bool)(i[2]))
 | ...
 |(bool)(i[n]));
```

Before this patch, on my local machine, Clang begins emitting
`-Wstack-exhausted` when `n` is 715. After the patch, that threshold
increases to `950`.
The `Declarator` class is large (4584 bytes) and used as a stack-local
variable during parsing.

This patch reduces the default size of its `DeclTypeInfo` member,
bringing the overall size down to 3880 bytes. This allows Clang to handle
more deeply nested expressions without exhausting the stack.

Combined with llvm#148726, the nesting threshold for such expressions
increases to `~1100`.

No performance impact has been
[observed](https://llvm-compile-time-tracker.com/compare.php?from=d4f5ed6a23464cbe831820cb695aa1d39b11e4aa&to=66ba54b8a295cc2759387ef2a4a162de2ad4946e&stat=instructions:u).